Convex Functions
We begin with a definition of the graph of a real-valued function.
Definition
The graph of f:Ω→R,Ω⊂Rn, is the set of points in Ω×R⊂Rn+1 given by
{[xf(x)]:x∈Ω}. We can visualize the graph of f as simply the set of points on a "plot" of f(x) versus x.
Definition
The epigraph of a function f:Ω→R,Ω⊂Rn, denoted epi (f), is the set of points in Ω×R given by
epi(f)={[xβ]:x∈Ω,β∈R,β≥f(x)}. The epigraph epi (f) of a function f is simply the set of points in Ω×R on or above the graph of f. We can also think of epi (f) as a subset of Rn+1.
Recall that a set Ω⊂Rn is convex if for every x1,x2∈Ω and α∈(0,1), αx1+(1−α)x2∈Ω. We now introduce the notion of a convex function.
Definition
A function f:Ω→R,Ω⊂Rn, is convex on Ω if its epigraph is a convex set.
Theorem
If a function f:Ω→R,Ω⊂Rn, is convex on Ω, then Ω is a convex set.
Proof
We prove this theorem by contraposition. Suppose that Ω is not a convex set. Then, there exist two points y1 and y2 such that for some α∈ (0,1)
z=αy1+(1−α)y2∈/Ω. Let
β1=f(y1),β2=f(y2). Then, the pairs
[y1β1],[y2β2] belong to the graph of f, and hence also the epigraph of f. Let
w=α[y1β1]+(1−α)[y2β2] We have
w=[zαβ1+(1−α)β2] But note that w∈/ epi (f), because z∈/Ω. Therefore, epi (f) is not convex, and hence f is not a convex function.
The next theorem gives a very useful characterization of convex functions. This characterization is often used as a definition for a convex function.
Theorem
A function f:Ω→R defined on a convex set Ω⊂Rn is convex if and only if for all x,y∈Ω and all α∈(0,1), we have
f(αx+(1−α)y)≤αf(x)+(1−α)f(y) Proof
⇐ : Assume that for all x,y∈Ω and α∈(0,1)
f(αx+(1−α)y)≤αf(x)+(1−α)f(y). Let [x⊤,a]⊤ and [y⊤,b]⊤ be two points in epi (f), where a,b∈R. From the definition of epi (f) it follows that
f(x)≤a,f(y)≤b Therefore, using the first inequality above, we have
f(αx+(1−α)y)≤αa+(1−α)b. Because Ω is convex, αx+(1−α)y∈Ω. Hence,
[αx+(1−α)yαa+(1−α)b]∈epi(f) which implies that epi (f) is a convex set, and hence f is a convex function.
⇒ : Assume that f:Ω→R is a convex function. Let x,y∈Ω and
f(x)=a,f(y)=b. Thus,
[xa],[yb]∈epi(f) Because f is a convex function, its epigraph is a convex subset of Rn+1. Therefore, for all α∈(0,1), we have
α[xa]+(1−α)[yb]=[αx+(1−α)yαa+(1−α)b]∈epi(f) The above implies that for all α∈(0,1),
f(αx+(1−α)y)≤αa+(1−α)b=αf(x)+(1−α)f(y) This completes the proof.
A geometric interpretation of the theorem is given below. The theorem states that if f:Ω→R is a convex function over a convex set Ω, then for all x,y∈Ω, the points on the line segment in Rn+1 connecting [x⊤,f(x)]⊤ and [y⊤,f(y)]⊤ must lie on or above the graph of f.
Using the above theorem, it is straightforward to show that any nonnegative scaling of a convex function is convex, and that the sum of convex functions is convex.
Theorem
Suppose that f,f1, and f2 are convex functions. Then, for any a≥0, the function af is convex. Moreover, f1+f2 is convex.
Proof
Let x,y∈Ω and α∈(0,1). Fix a≥0. For convenience, write fˉ=af. We have
fˉ(αx+(1−α)y)=af(αx+(1−α)y)≤a(αf(x)+(1−α)f(y)) because f is convex and a≥0=α(af(x))+(1−α)(af(y))=αfˉ(x)+(1−α)fˉ(y) which implies that fˉ is convex.
Next, write f3=f1+f2. We have
f3(αx+(1−α)y)=≤==f1(αx+(1−α)y)+f2(αx+(1−α)y)(αf1(x)+(1−α)f1(y))+(αf2(x)+(1−α)f2(y)) by convexity of f1 and f2α(f1(x)+f2(x))+(1−α)(f1(y)+f2(y))αf3(x)+(1−α)f3(y), which implies that f3 is convex.
This theorem implies that for any given collection of convex functions f1,…,fℓ and nonnegative numbers c1,…,cℓ, the function c1f2+⋯+cℓfℓ is convex. It is similarly straightforward to show that the function max{f1,…,fℓ} is convex.
We now define the notion of strict convexity.
Definition
A function f:Ω→R on a convex set Ω⊂Rn is strictly convex if for all x,y∈Ω,x=y, and α∈(0,1), we have
f(αx+(1−α)y)<αf(x)+(1−α)f(y). From this definition, we see that for a strictly convex function, all points on the open line segment connecting the points [x⊤,f(x)]⊤ and [y⊤,f(y)]⊤ lie (strictly) above the graph of f.
Definition
A function f:Ω→R on a convex set Ω⊂Rn is (strictly) concave if −f is (strictly) convex.
A convex optimization problem is one of the form
minimize subject to f0(x)fi(x)≤0,i=1,…,maiTx=bi,i=1,…,p, where f0,…,fm are convex functions. The convex problem has three additional requirements:
- the objective function must be convex,
- the inequality constraint functions must be convex,
- the equality constraint functions hi(x)=aiTx−bi must be affine.